11 Sep 2024
The models I will be discussing today were implemented by my colleage Juho Timonen.
My contribution to this project is mainly in motivating the use case and applying it in practice.
Imagine you are analyzing data from an Oncology clinical trial.
We measure the size of patients’ tumors every 12 weeks, until they withdraw or stop therapy.
Key research question: Which treatment is more effective?
For this analysis, we will focus on the tumor-size data.
The typical approach is to model these using a non-linear function called the “Stein-Fojo” or SF model (Stein et al. 2008)
\[ f^{\text{SF}}(t \mid k_g, k_s) = \exp(k_g t) + \exp(-k_s t) - 1, \]
where \(t\) is a measure of time and \(k_g, k_s\) are unknown growth and shrinkage parameters, respectively.
As an example, this model describes the example data we showed earlier well. Here we fit this to 2 subjects at random.
We released an open-source R package sfgp to address these shortcomings.
This is largely the work of my colleage Juho Timonen.
We start with the likelihood of observation \(i\) as
\[ \log(y_i + \delta) \sim \mathcal{N}(f_i, \sigma^2), \]
where
\[ f_i = f(\mathbf{x}_i)= \sum_{j=1}^J f^{(j)}(\mathbf{x}_i) \]
is the expected log tumor size, \(\sigma\) is an unknown parameter, and the \(\delta\) value is a constant.
The functions \(f^{(j)}\), \(j=1, \ldots, J\) are the additive function components called terms in the package documentation.
The baseline value is estimated per subject by default
\[ f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \]
The SF term sf(t, id) implements the SF function described previously
\[ f^{\text{SF}}(t \mid k_g, k_s) = \exp(k_g t) + \exp(-k_s t) - 1 \]
In this example, both \(k_g\) and \(k_s\) have an estimated overall mean and varying effect per id. This can also be defined as `sf(t, id | arm) to add additional hierarchies.
The gp(t) term implements a Hilbert-Space reduced-rank guassian process regression term (HSGP) (Solin and Särkkä 2020).
\[ f^{\text{HSGP}} \left(\text{t} \mid \mathbf{\xi}_{\text{t}}, \alpha_{\text{t}}, \ell_{\text{t}}, B_{\text{t}}, L_{\text{t}}\right) \]
The gp term can be defined per category, as gp(t, arm):
\[ f^{\text{HSGP}} \left(\text{t}, \text{arm} \mid \mathbf{\xi}^{(1)}_{\text{t} \times \text{arm}}, \ldots, \mathbf{\xi}^{(G_{arm})}_{\text{t} \times \text{arm}}, \alpha_{\text{t} \times \text{arm}}, \ell_{\text{t} \times \text{arm}}, B_{\text{t} \times \text{arm}}, L_{\text{t} \times \text{arm}}\right) \]
where \(G_{\text{arm}}\) is the number of treatment arms and GP kernel hyper-parameters are shared between groups (here, treatment arms).
We define a basic SF model with \(k_s\) and \(k_g\) terms varying by id and arm
\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\].
Next we define an SF+GP Model. This includes both an SF and a GP term in addition to the baseline term. It effectively treats the SF as the mean function for the GP:
\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{HSGP}} \left(\text{t}, \text{arm} \mid \mathbf{\xi}^{(1)}_{\text{t} \times \text{arm}}, \ldots, \mathbf{\xi}^{(G_{arm})}_{\text{t} \times \text{arm}}, \alpha_{\text{t} \times \text{arm}}, \ell_{\text{t} \times \text{arm}}, B_{\text{t} \times \text{arm}}, L_{\text{t} \times \text{arm}}\right)\\ f^{(3)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\]
Finally, we define an alternative SF+GP Model to show how both the \(k_s\) and \(k_g\) term sub-components can be specified.
TSModel$new(
y ~ sff(
t | ks ~ offset(id_ks | arm_ks) + gp(t_ks, arm_ks),
kg ~ offset(id_kg | arm_kg) + gp(t_kg, arm_kg)
)
)It has the terms
\[\begin{align} f^{(1)}(\mathbf{x}) &= f^{\text{log-SF}} \left(\text{t} \mid \mathbf{k}_{g}, \mathbf{k}_{s}\right)\\ f^{(2)}(\mathbf{x}) &= f^{\text{BAS}} \left(\text{id} \mid \mathbf{c}_{0}\right) \end{align}\]
However each of \(\mathbf{k}_{g}\) and \(\mathbf{k}_{s}\) are comprised of sub-component terms per ID | arm and HSGP(t, arm).
An important feature of these data is the determination of “clinical progression”
When scoring models using PSIS-LOO ELPD, we see a clear difference in model performance
| Model | elpd_diff | se_diff |
|---|---|---|
| Alt SF+GP | 0.0000 | 0.00000 |
| SF+GP | -594.1177 | 43.88187 |
| SF | -1549.4241 | 42.48850 |
Can be simulated with treatment_effects method.
StanCon 2024